Add CloudWatch SEARCH() expression support for dynamic metric alarms by jaredbrook · Pull Request #143 · base2Services/cfn-guardian

jaredbrook · 2026-04-17T01:48:58Z

No description provided.

Copilot

Pull request overview

Adds support for CloudWatch SEARCH()-based dynamic metric alarms so alarms can target changing resources (e.g., replaced Auto Scaling Groups) without requiring fixed dimensions.

Changes:

Extend alarm models to carry search_expression / search_aggregation, plus interpolation support for ${Resource::...} inside search expressions.
Emit CloudFormation alarms using Metrics (metric math) when search_expression is present, and relax validation to not require MetricName/Namespace in that case.
Add documentation and RSpec coverage for search-expression alarms; bump gem version/dependencies.

Reviewed changes

Copilot reviewed 9 out of 10 changed files in this pull request and generated 7 comments.

Show a summary per file

File	Description
spec/search_expression_spec.rb	New specs covering model fields, template emission, interpolation, and compile-time validation behavior
lib/cfnguardian/models/alarm.rb	Add `search_expression` / `search_aggregation` fields to alarms
lib/cfnguardian/resources/base.rb	Interpolate `${Resource::...}` variables inside search expressions
lib/cfnguardian/stacks/resources.rb	Emit `AWS::CloudWatch::Alarm` using `Metrics`/expressions when `search_expression` is set
lib/cfnguardian/compile.rb	Skip `metric_name`/`namespace` validation for search-expression alarms; validate empty SearchExpression
docs/search_expressions.md	New documentation describing configuration and limitations
docs/variables.md	Document variable support inside search expressions
docs/overview.md	Add Search Expression Alarms to docs TOC
lib/cfnguardian/version.rb	Bump version to 0.13.0
Gemfile.lock	Lockfile updates for new version and dependency resolutions

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-17T01:53:23Z

@@ -0,0 +1,215 @@
+require 'spec_helper'
+require 'json'
+require 'yaml'


This spec uses FileUtils.mkdir_p/FileUtils.rm_f later but never require 'fileutils'. Without it, the spec will raise NameError: uninitialized constant FileUtils in a clean test environment. Add the missing require (or avoid FileUtils usage).

Suggested change

require 'yaml'

require 'yaml'

require 'fileutils'

Copilot · 2026-04-17T01:53:23Z

+          expect(search_alarms.length).to eq(1)
+          expect(search_alarms.first.search_expression).to include('SEARCH')
+        ensure
+          FileUtils.rm_f(fixture)


The spec creates spec/fixtures via FileUtils.mkdir_p(config_dir) but only removes the YAML file in the ensure block. This leaves the fixtures directory behind after the test run. Consider removing the directory as well (when empty) or using a temp directory (e.g., Dir.mktmpdir) to avoid polluting the working tree.

Suggested change

FileUtils.rm_f(fixture)

FileUtils.rm_f(fixture)

FileUtils.rmdir(config_dir) if Dir.exist?(config_dir) && Dir.children(config_dir).empty?

Copilot · 2026-04-17T01:53:23Z

+        # String interpolation for search expressions
+        if alarm.search_expression.is_a?(String)
+          alarm.search_expression = alarm.search_expression.gsub(/\${Resource::([A-Za-z]+)}/) do
+            resource_key = $1
            if @resource.has_key?(resource_key)
-              logger.debug "overriding alarm #{alarm.name} dimension key '#{k}' with value '#{@resource[resource_key]}'" 
-              alarm.dimensions[k] = @resource[resource_key]
+              logger.debug "interpolating search_expression variable '#{resource_key}' with value '#{@resource[resource_key]}' for alarm #{alarm.name}"
+              @resource[resource_key]
+            else
+              "${Resource::#{resource_key}}"
            end


Search expression interpolation only matches ${Resource::...} keys made of letters (/\${Resource::([A-Za-z]+)}/). This is more restrictive than the dimension-variable interpolation above and contradicts the docs that say “any key from the resource definition” (keys may contain digits/underscores). Consider broadening the regex (e.g., allow [A-Za-z0-9_]+), and prefer Regexp.last_match(1) over $1 for clarity.

Copilot · 2026-04-17T01:53:24Z

+          if resource.search_expression
+            if resource.search_expression.empty?
+              @errors << "CfnGuardian::AlarmPropertyError - alarm #{resource.name} for resource #{resource.resource_id} has an empty SearchExpression."
+            end


resource.search_expression.empty? assumes search_expression is a String. If a user misconfigures YAML (e.g., SearchExpression: []), this will raise a NoMethodError during validation instead of producing a helpful validation message. Consider validating search_expression.is_a?(String) and treating blank/whitespace-only strings as invalid (e.g., strip.empty?) to keep validation robust.

Copilot · 2026-04-17T01:53:24Z

+      def add_search_expression_alarm(alarm, actions)
+        search_expr = alarm.search_expression
+        aggregation = alarm.search_aggregation || 'MAX'
+
+        @template.declare do
+          CloudWatch_Alarm("#{alarm.resource_hash}#{alarm.group}#{alarm.name.gsub(/[^0-9a-zA-Z]/i, '')}#{alarm.type}"[0..255]) do
+            ActionsEnabled true
+            AlarmDescription "Guardian alarm #{alarm.name} for the resource #{alarm.resource_id} in alarm group #{alarm.group}"
+            AlarmName CfnGuardian::CloudWatch.get_alarm_name(alarm)
+            ComparisonOperator alarm.comparison_operator
+            EvaluationPeriods alarm.evaluation_periods
+            Threshold alarm.threshold
+            AlarmActions actions
+            OKActions actions unless alarm.ok_action_disabled
+            TreatMissingData alarm.treat_missing_data unless alarm.treat_missing_data.nil?
+            DatapointsToAlarm alarm.datapoints_to_alarm unless alarm.datapoints_to_alarm.nil?
+            Metrics [
+              {
+                Id: 'search_expression',
+                Expression: search_expr,
+                ReturnData: false
+              },
+              {
+                Id: 'aggregate',
+                Expression: "#{aggregation}(search_expression)",
+                ReturnData: true
+              }


search_aggregation is inserted directly into the metric-math expression ("#{aggregation}(search_expression)") without validation/normalization. A typo or lowercase value will generate an invalid CloudWatch expression and fail at deploy time. Consider upcasing the value and validating it against the supported set (MAX/MIN/AVG/SUM), surfacing a compile-time validation error when invalid.

Copilot · 2026-04-17T01:53:24Z

@@ -57,6 +65,38 @@ def add_alarm(alarm)
          end
        end
      end
+
+      def add_search_expression_alarm(alarm, actions)
+        search_expr = alarm.search_expression
+        aggregation = alarm.search_aggregation || 'MAX'
+
+        @template.declare do
+          CloudWatch_Alarm("#{alarm.resource_hash}#{alarm.group}#{alarm.name.gsub(/[^0-9a-zA-Z]/i, '')}#{alarm.type}"[0..255]) do
+            ActionsEnabled true
+            AlarmDescription "Guardian alarm #{alarm.name} for the resource #{alarm.resource_id} in alarm group #{alarm.group}"
+            AlarmName CfnGuardian::CloudWatch.get_alarm_name(alarm)
+            ComparisonOperator alarm.comparison_operator
+            EvaluationPeriods alarm.evaluation_periods
+            Threshold alarm.threshold
+            AlarmActions actions
+            OKActions actions unless alarm.ok_action_disabled
+            TreatMissingData alarm.treat_missing_data unless alarm.treat_missing_data.nil?
+            DatapointsToAlarm alarm.datapoints_to_alarm unless alarm.datapoints_to_alarm.nil?
+            Metrics [
+              {
+                Id: 'search_expression',
+                Expression: search_expr,
+                ReturnData: false
+              },
+              {
+                Id: 'aggregate',
+                Expression: "#{aggregation}(search_expression)",
+                ReturnData: true
+              }
+            ]
+          end
+        end
+      end


add_search_expression_alarm duplicates most of the CloudWatch_Alarm property wiring from add_standard_alarm, which increases the chance of future drift (e.g., tags/unit/low-sample settings). Consider extracting the common alarm property assignment into a shared helper, and only branching for the mutually-exclusive metric configuration (Dimensions/MetricName/etc vs Metrics).

Copilot · 2026-04-17T01:53:24Z

+Instead of emitting a CloudWatch alarm with fixed `Dimensions`, `MetricName`, `Namespace`, and `Statistic` properties, a search expression alarm emits `MetricDataQueries` with:
+
+1. A **SEARCH()** expression that dynamically matches metrics by partial or exact name
+2. An **aggregation function** (e.g. `MAX`, `AVG`, `SUM`) that reduces the matched metrics to a single time series for threshold evaluation
+
+## Configuration
+
+Add `SearchExpression` and optionally `SearchAggregation` to an alarm template. When `SearchExpression` is set, the `Dimensions`, `MetricName`, `Namespace`, `Statistic`, and `Period` properties are not used since CloudWatch treats these as mutually exclusive with `MetricDataQueries`.


This doc says Guardian “emits MetricDataQueries”, but the CloudFormation AWS::CloudWatch::Alarm property name is Metrics (a list of MetricDataQuery objects). Consider adjusting wording to avoid confusion for readers cross-referencing the CFN spec.

Suggested change

Instead of emitting a CloudWatch alarm with fixed `Dimensions`, `MetricName`, `Namespace`, and `Statistic` properties, a search expression alarm emits `MetricDataQueries` with:

1. A **SEARCH()** expression that dynamically matches metrics by partial or exact name

2. An **aggregation function** (e.g. `MAX`, `AVG`, `SUM`) that reduces the matched metrics to a single time series for threshold evaluation

## Configuration

Add `SearchExpression` and optionally `SearchAggregation` to an alarm template. When `SearchExpression` is set, the `Dimensions`, `MetricName`, `Namespace`, `Statistic`, and `Period` properties are not used since CloudWatch treats these as mutually exclusive with `MetricDataQueries`.

Instead of emitting a CloudWatch alarm with fixed `Dimensions`, `MetricName`, `Namespace`, and `Statistic` properties, a search expression alarm emits the CloudFormation `Metrics` property (a list of `MetricDataQuery` objects) with:

1. A **SEARCH()** expression that dynamically matches metrics by partial or exact name

2. An **aggregation function** (e.g. `MAX`, `AVG`, `SUM`) that reduces the matched metrics to a single time series for threshold evaluation

## Configuration

Add `SearchExpression` and optionally `SearchAggregation` to an alarm template. When `SearchExpression` is set, the `Dimensions`, `MetricName`, `Namespace`, `Statistic`, and `Period` properties are not used since CloudWatch treats these as mutually exclusive with the alarm `Metrics` property.

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-17T02:09:34Z

+    context 'when search expression alarm has no metric_name or namespace' do
+      it 'does not raise validation errors' do
+        fixture = File.join(config_dir, 'search_expression_alarms.yaml')
+        FileUtils.mkdir_p(config_dir)
+        File.write(fixture, {
+          'Resources' => {
+            'AutoScalingGroup' => [{ 'Id' => 'my-app-AsgGroup-abc123' }]
+          },
+          'Templates' => {
+            'AutoScalingGroup' => {
+              'CPUUtilizationHighBase' => {
+                'SearchExpression' => "SEARCH('{AWS/EC2,AutoScalingGroupName} MetricName=\"CPUUtilization\" my-app', 'Minimum', 60)",
+                'SearchAggregation' => 'MAX'
+              },
+              'StatusCheckFailed' => false
+            }


This example is described as having “no metric_name or namespace”, but it overrides an existing AutoScalingGroup default alarm (CPUUtilizationHighBase) that already sets both fields, so it doesn’t actually exercise the validation change. Consider adding a new alarm (one that doesn’t exist in defaults) with only SearchExpression set, and assert that validation passes and the created alarm has metric_name (and/or namespace) still nil.

Copilot · 2026-04-17T02:09:35Z

      run: |
-        gem install rspec
-        rspec
+        gem install bundler -v 2.4.22


The workflow installs Bundler 2.4.22, but the lockfile was generated with Bundler 2.3.19 (BUNDLED WITH). This often produces warnings and may cause unintended lockfile churn. Consider aligning the workflow Bundler version with the lockfile, or update the lockfile’s BUNDLED WITH to match the workflow version.

Suggested change

gem install bundler -v 2.4.22

gem install bundler -v 2.3.19

Copilot · 2026-04-17T02:09:35Z

+            Metrics [
+              {
+                Id: 'search_expression',
+                Expression: search_expr,
+                ReturnData: false


When using the CloudFormation Metrics property (metric math alarms), ThresholdMetricId is required to tell CloudWatch which query ID to evaluate against Threshold. Without it, the generated alarm resource may fail template validation or evaluate an unexpected metric. Set ThresholdMetricId to the ID of the return-data query (likely aggregate).

Copilot · 2026-04-17T02:09:35Z

-              logger.debug "overriding alarm #{alarm.name} dimension key '#{k}' with value '#{@resource[resource_key]}'" 
-              alarm.dimensions[k] = @resource[resource_key]
+              logger.debug "interpolating search_expression variable '#{resource_key}' with value '#{@resource[resource_key]}' for alarm #{alarm.name}"
+              @resource[resource_key]


The gsub replacement block must return a String. If a resource value is numeric/boolean (e.g., a Port), returning it directly will raise TypeError during interpolation. Convert interpolated values to strings (e.g., @resource[resource_key].to_s) before returning from the block.

Suggested change

@resource[resource_key]

@resource[resource_key].to_s

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 4 comments.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-17T02:20:45Z

+          if resource.search_expression
+            if !resource.search_expression.is_a?(String) || resource.search_expression.strip.empty?
+              @errors << "CfnGuardian::AlarmPropertyError - alarm #{resource.name} for resource #{resource.resource_id} has an invalid SearchExpression. Must be a non-empty string."
+            end
+            if resource.search_aggregation
+              valid_aggregations = %w(MAX MIN AVG SUM)
+              normalized = resource.search_aggregation.to_s.upcase
+              if valid_aggregations.include?(normalized)
+                resource.search_aggregation = normalized
+              else
+                @errors << "CfnGuardian::AlarmPropertyError - alarm #{resource.name} for resource #{resource.resource_id} has invalid SearchAggregation '#{resource.search_aggregation}'. Must be one of: #{valid_aggregations.join(', ')}."
+              end


New validation behavior was added for SearchExpression (non-empty string) and SearchAggregation (must be one of MAX/MIN/AVG/SUM and normalized to uppercase), but there are no specs asserting these error cases and normalization. Add RSpec coverage that (1) invalid/blank SearchExpression causes CfnGuardian::ValidationError, and (2) invalid SearchAggregation is rejected while valid lowercase values are normalized.

Copilot · 2026-04-17T02:20:46Z

+| Property | Required | Default | Description |
+|---|---|---|---|
+| `SearchExpression` | Yes | - | A CloudWatch SEARCH() expression string. Supports `${Resource::...}` [variables](variables.md). |
+| `SearchAggregation` | No | `MAX` | Aggregation function applied to the search results. Valid values: `MAX`, `MIN`, `AVG`, `SUM`. |
+


The markdown table under "Properties" is malformed (lines start with ||), which renders as an empty first column in most markdown parsers. Use standard table syntax with a single leading/trailing | per row.

Copilot · 2026-04-17T02:20:46Z

+require 'fileutils'
+require 'tmpdir'
+require 'term/ansicolor'


fileutils and term/ansicolor are required but not used anywhere in this spec. Removing unused requires reduces load time and avoids pulling extra dependencies into the test process unnecessarily.

Suggested change

require 'fileutils'

require 'tmpdir'

require 'term/ansicolor'

require 'tmpdir'

Copilot · 2026-04-17T02:20:46Z

      def add_alarm(alarm)
        actions = alarm.alarm_action.kind_of?(Array) ? alarm.alarm_action.map{|action| Ref(action)} : [Ref(alarm.alarm_action)]
        actions.concat alarm.maintenance_groups.map {|mg| Ref(mg)} if alarm.maintenance_groups.any?
+        use_search = alarm.search_expression ? true : false


use_search is true for any non-nil value, including an empty string. Since '' is truthy in Ruby, this would emit a Metrics-based alarm with a blank Expression, producing an invalid CloudFormation template. Consider basing use_search on alarm.search_expression.is_a?(String) && !alarm.search_expression.strip.empty? (or raising early) to align with validation behavior.

Suggested change

use_search = alarm.search_expression ? true : false

use_search = alarm.search_expression.is_a?(String) && !alarm.search_expression.strip.empty?

Copilot

Pull request overview

Copilot reviewed 11 out of 12 changed files in this pull request and generated 1 comment.

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Copilot · 2026-04-17T03:41:53Z

+                  Expression: "#{aggregation}(search_expression)",
+                  ReturnData: true
+                }
+              ]


When using the CloudWatch Alarm Metrics property (metric math), CloudFormation expects a ThresholdMetricId to indicate which MetricDataQuery is evaluated against Threshold/ComparisonOperator. This block builds queries search_expression and aggregate but never sets ThresholdMetricId, which can cause the alarm to fail creation or evaluate the wrong series. Set ThresholdMetricId to the query id that returns data (e.g. aggregate).

Suggested change

]

]

ThresholdMetricId 'aggregate'

Add CloudWatch SEARCH() expression support for dynamic metric alarms

e2110ca

jaredbrook requested review from Copilot and tarunmenon95 April 17, 2026 01:48

Copilot started reviewing on behalf of jaredbrook April 17, 2026 01:49 View session

Copilot AI reviewed Apr 17, 2026

View reviewed changes

jaredbrook added 2 commits April 17, 2026 11:57

Add rspec to gemspec dev dependencies and use bundle exec in CI

ba55420

Pin bundler to 2.4.22 for Ruby 2.7 compatibility in CI

e270a44

jaredbrook requested a review from Copilot April 17, 2026 02:03

Copilot started reviewing on behalf of jaredbrook April 17, 2026 02:04 View session

Copilot AI reviewed Apr 17, 2026

View reviewed changes

Address PR review feedback for search expression feature

efb4049

jaredbrook requested a review from Copilot April 17, 2026 02:15

Copilot started reviewing on behalf of jaredbrook April 17, 2026 02:15 View session

Copilot AI reviewed Apr 17, 2026

View reviewed changes

Address second round of PR review feedback

f274be8

jaredbrook requested a review from Copilot April 17, 2026 03:37

Copilot started reviewing on behalf of jaredbrook April 17, 2026 03:38 View session

Copilot AI reviewed Apr 17, 2026

View reviewed changes

	FileUtils.rm_f(fixture)
	FileUtils.rm_f(fixture)
	FileUtils.rmdir(config_dir) if Dir.exist?(config_dir) && Dir.children(config_dir).empty?

	use_search = alarm.search_expression ? true : false
	use_search = alarm.search_expression.is_a?(String) && !alarm.search_expression.strip.empty?

Conversation

jaredbrook commented Apr 17, 2026

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Uh oh!

Copilot AI Apr 17, 2026

Choose a reason for hiding this comment

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

2 participants